
    One-Class Support Measure Machines for Group Anomaly Detection

    We propose one-class support measure machines (OCSMMs) for group anomaly detection, which aims to recognize anomalous aggregate behaviors of data points. OCSMMs generalize the well-known one-class support vector machines (OCSVMs) to a space of probability measures. By formulating the problem as quantile estimation on distributions, we establish an interesting connection to OCSVMs and to variable kernel density estimators (VKDEs) over the input space on which the distributions are defined, bridging the gap between large-margin methods and kernel density estimators. In particular, we show that various types of VKDEs can be considered as solutions to a class of regularization problems studied in this paper. Experiments on the Sloan Digital Sky Survey dataset and a High Energy Particle Physics dataset demonstrate the benefits of the proposed framework in real-world applications.
    Comment: Conference on Uncertainty in Artificial Intelligence (UAI 2013)
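    A minimal sketch of the core construction, not the authors' implementation: each group is embedded as a kernel mean, and a standard one-class SVM is trained on the resulting inter-group kernel. The synthetic groups, Gaussian bandwidth, and nu parameter below are illustrative assumptions.

```python
# Sketch: group anomaly detection via a one-class SVM on kernel mean embeddings.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
groups = [rng.normal(0, 1, size=(50, 2)) for _ in range(20)]   # "normal" groups
groups += [rng.normal(0, 3, size=(50, 2)) for _ in range(2)]   # anomalous spread

def mean_embedding_kernel(A, B, gamma=0.5):
    """Inner product of the kernel mean embeddings of point sets A and B:
    (1/|A||B|) * sum_ij exp(-gamma * ||a_i - b_j||^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2).mean()

n = len(groups)
K = np.array([[mean_embedding_kernel(groups[i], groups[j]) for j in range(n)]
              for i in range(n)])

ocsmm = OneClassSVM(kernel="precomputed", nu=0.1).fit(K)
print(ocsmm.predict(K))   # -1 flags candidate anomalous groups
```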

    Comment on "Support Vector Machines with Applications"

    Comment on "Support Vector Machines with Applications" [math.ST/0612817]
    Comment: Published at http://dx.doi.org/10.1214/088342306000000484 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Deep Nonlinear Non-Gaussian Filtering for Dynamical Systems

    Filtering is a general name for inferring the states of a dynamical system from observations. The most common approach is Gaussian filtering (GF), in which the distribution of the inferred states is a Gaussian whose mean is an affine function of the observations. This model carries two restrictions: Gaussianity and affinity. We propose a model that relaxes both assumptions, based on recent advances in implicit generative models. Empirical results show that the proposed method gives a significant advantage over GF and over nonlinear methods based on fixed nonlinear kernels.
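    For contrast with the baseline, a minimal scalar Kalman filter, the canonical Gaussian filter: the posterior mean is an affine function of the observation, which is exactly the restriction being relaxed. The model coefficients below are assumptions for illustration, not from the paper.

```python
# Sketch: scalar Kalman filter (Gaussian filtering with an affine mean).
import numpy as np

a, c = 0.9, 1.0    # assumed state-transition and observation coefficients
q, r = 0.1, 0.5    # assumed process and observation noise variances
m, p = 0.0, 1.0    # prior mean and variance of the state

rng = np.random.default_rng(1)
x = 0.0
for _ in range(10):
    x = a * x + rng.normal(0.0, np.sqrt(q))       # latent state evolves
    y = c * x + rng.normal(0.0, np.sqrt(r))       # noisy observation
    m, p = a * m, a * a * p + q                   # predict step
    k = p * c / (c * c * p + r)                   # Kalman gain
    m, p = m + k * (y - c * m), (1 - k * c) * p   # update: m is affine in y
    print(f"y = {y:+.2f}   filtered mean = {m:+.2f}")
```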

    The representer theorem for Hilbert spaces: a necessary and sufficient condition

    A family of regularization functionals is said to admit a linear representer theorem if every member of the family admits minimizers that lie in a fixed finite-dimensional subspace. A recent characterization states that a general class of regularization functionals with differentiable regularizers admits a linear representer theorem if and only if the regularization term is a non-decreasing function of the norm. In this report, we improve on that result by replacing the differentiability assumption with lower semi-continuity and by deriving a proof that is independent of the dimensionality of the space.
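    For orientation, the standard statement being characterized (usual notation assumed: data x_1, ..., x_n, RKHS H with kernel k, regularizer h):

```latex
% If the regularization functional
%   J(f) = C\big((x_1, f(x_1)), \ldots, (x_n, f(x_n))\big) + h\big(\lVert f \rVert_{\mathcal{H}}\big)
% has a non-decreasing regularizer h, then it admits a minimizer of the form
f^{\star}(\cdot) \;=\; \sum_{i=1}^{n} c_i \, k(x_i, \cdot), \qquad c_i \in \mathbb{R},
% i.e. a minimizer in the finite-dimensional span of the kernel sections k(x_i, \cdot).
```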

    Submodular Inference of Diffusion Networks from Multiple Trees

    Diffusion and propagation of information, influence, and diseases take place over increasingly large networks. We observe when a node copies information, makes a decision, or becomes infected, but the underlying networks are often hidden or unobserved. Since networks are highly dynamic, changing and growing rapidly, we only observe a relatively small set of cascades before a network changes significantly. Scalable network inference from a small cascade set is therefore necessary for understanding the rapidly evolving dynamics that govern diffusion. In this article, we develop a scalable approximation algorithm based on submodular maximization, with provably near-optimal performance and high accuracy in this setting, solving an open problem first posed by Gomez-Rodriguez et al. (2010). Experiments on synthetic and real diffusion data show that in practice our algorithm achieves an optimal trade-off between accuracy and running time.
    Comment: To appear in the 29th International Conference on Machine Learning (ICML), 2012. Website: http://www.stanford.edu/~manuelgr/network-inference-multitree
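    A generic sketch of the greedy template behind such submodular-maximization guarantees, not the paper's likelihood-based objective: at each step, add the candidate edge with the largest marginal gain; for monotone submodular objectives this yields the classical (1 - 1/e) approximation. The cascades, candidate edges, stand-in coverage objective, and budget below are assumptions.

```python
# Sketch: greedy maximization of a monotone submodular objective over edges.
from itertools import combinations

cascades = [{"a", "b", "c"}, {"b", "c", "d"}, {"a", "d"}]   # assumed observations
candidate_edges = list(combinations("abcd", 2))

def score(edges):
    """Stand-in coverage objective (monotone submodular): a cascade counts as
    explained if some chosen edge has both endpoints infected in it."""
    return sum(any(u in c and v in c for (u, v) in edges) for c in cascades)

selected, budget = set(), 3
for _ in range(budget):
    best = max((e for e in candidate_edges if e not in selected),
               key=lambda e: score(selected | {e}) - score(selected))
    selected.add(best)
print(sorted(selected))
```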

    Causal Inference on Discrete Data using Additive Noise Models

    Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. Recently, methods using additive noise models have been suggested to approach the case of continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. We prove that whenever the joint distribution $P^{(X,Y)}$ admits such a model in one direction, e.g. $Y = f(X) + N$ with $N \perp X$, it does not admit the reversed model $X = g(Y) + \tilde{N}$ with $\tilde{N} \perp Y$, as long as the model is chosen in a generic way. Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables. In an extensive experimental study we show that this algorithm works on both synthetic and real data sets.
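    A rough sketch of the resulting decision rule, with synthetic data and a chi-squared test standing in for the independence test: fit f as the conditional mode, form residuals, and keep the direction in which the residuals look independent of the putative cause. All data and parameters below are illustrative assumptions.

```python
# Sketch: additive-noise-model test for a discrete cause-effect pair.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(2)
X = rng.integers(0, 4, size=2000)
N = rng.choice([-1, 0, 1], size=2000, p=[0.2, 0.6, 0.2])   # noise independent of X
Y = (2 * X + N) % 7                                        # Y = f(X) + N (mod 7)

def indep_pvalue(cause, effect, m=7):
    """p-value for 'residual independent of cause' after fitting f as the mode."""
    values = np.unique(cause)
    f = {v: np.bincount(effect[cause == v], minlength=m).argmax() for v in values}
    resid = np.array([(e - f[c]) % m for c, e in zip(cause, effect)])
    table = np.array([np.bincount(resid[cause == v], minlength=m) for v in values])
    table = table[:, table.sum(axis=0) > 0]   # drop empty residual columns
    return chi2_contingency(table)[1]

print("X -> Y:", indep_pvalue(X, Y))   # large p: forward model accepted
print("Y -> X:", indep_pvalue(Y, X))   # small p: reversed model rejected
```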

    Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions

    Kernel mean embeddings have recently attracted the attention of the machine learning community. They map measures μ from some set M to functions in a reproducing kernel Hilbert space (RKHS) with kernel k. The RKHS distance between two mapped measures is a semi-metric d_k over M. We study three questions. (I) For a given kernel, which sets M can be embedded? (II) When is the embedding injective over M (in which case d_k is a metric)? (III) How does the d_k-induced topology compare to other topologies on M? The existing machine learning literature has addressed these questions in cases where M is (a subset of) the finite regular Borel measures. We unify, improve, and generalise those results. Our approach naturally leads to continuous and possibly even injective embeddings of (Schwartz) distributions, i.e., generalised measures, but the reader is free to focus on measures only. In particular, we systemise and extend various (partly known) equivalences between different notions of universal, characteristic, and strictly positive definite kernels, and show that on an underlying locally compact Hausdorff space, d_k metrises the weak convergence of probability measures if and only if k is continuous and characteristic.
    Comment: Older and longer version of the JMLR paper with the same title (published 2018). Please start with the JMLR version. 55 pages (33 pages main text, 22 pages appendix), 2 tables, 1 figure (in appendix).
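    A small sketch of d_k in the empirical case, where it reduces to the (biased) maximum mean discrepancy between two samples; the Gaussian kernel, its bandwidth, and the data are assumptions for illustration.

```python
# Sketch: the RKHS distance d_k between two empirical measures (biased MMD)
# with a Gaussian kernel, which is characteristic on R^d, so d_k is a metric.
import numpy as np

def gaussian_gram(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def d_k(A, B, gamma=0.5):
    """Empirical d_k(P_A, P_B): RKHS norm of the difference of mean embeddings."""
    return np.sqrt(gaussian_gram(A, A, gamma).mean()
                   + gaussian_gram(B, B, gamma).mean()
                   - 2.0 * gaussian_gram(A, B, gamma).mean())

rng = np.random.default_rng(3)
P1 = rng.normal(0.0, 1.0, size=(500, 2))
P2 = rng.normal(0.0, 1.0, size=(500, 2))   # same distribution as P1
Q = rng.normal(0.5, 1.0, size=(500, 2))    # shifted distribution
print("d_k(P1, P2) ~", d_k(P1, P2))        # near zero
print("d_k(P1, Q)  ~", d_k(P1, Q))         # clearly positive
```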